Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Agent] Enhance FileManager and simplify the use of file managers #193

Merged
merged 33 commits into from
Dec 25, 2023

Conversation

Bobholamovic
Copy link
Member

@Bobholamovic Bobholamovic commented Dec 21, 2023

主要改动如下:

  1. 优化FileManager的使用方式。具体而言:对于初级用户,修改全局file manager的使用方式,提供配置全局file manager的功能;对于进阶用户,提供基础的资源管理功能,允许更精细化的资源、对象生命周期控制。以下是对于初级用户的示例用法。用户不需要关心file manager,只需要设置环境变量或全局配置一次。
import os

# 设置环境变量
os.environ["EB_AGENT_ACCESS_TOKEN"] = "xxx"

# 构建和使用各组件,不需要设置access_token,也不需要构建或传递file_manager
# 未设置环境变量

from erniebot_agent.file_io import GlobalFileManagerHandler

async def main():
    # 文件保存在指定位置,且文件夹不会在程序结束后被自动清理
    await GlobalFileManagerHandler().configure(access_token="xxx", save_dir="my/save/dir", prune_on_close=False)
    
    # 构建和使用各组件,不需要构建或传递file_manager
  1. 支持从环境变量读取全局access token等配置项。
  2. 新增File.write_contents_to方法,用于将文件内容转存到本地文件中。
  3. 修复Makefile中的bug。

更多细节详见comments。

@Bobholamovic Bobholamovic marked this pull request as draft December 21, 2023 11:44
@@ -1,6 +1,9 @@
.DEFAULT_GOAL = format lint type_check
.DEFAULT_GOAL = dev
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.DEFAULT_GOAL不能是多目标

from erniebot_agent.utils.logging import logger
from erniebot_agent.utils.mixins import GradioMixin

logger = logging.getLogger(__name__)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

采用统一的logging风格,后同。

@@ -27,15 +28,16 @@
ToolResponse,
)
from erniebot_agent.chat_models.base import ChatModel
from erniebot_agent.file_io import protocol
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

直接使用模块protocol,而不是从中import方法,这样有利于测试时做patch。后同。

if file is None:
raise RuntimeError(f"Unregistered ID {repr(val)} is used by {repr(tool)}.")
elif is_remote_file_id(val):
if protocol.is_file_id(val):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

重构后FileManager不再考虑多FileManager共享一个FileRegistry的情形,这大大简化了我们要处理的情况,例如这里只需要判断FileManager在自己的registry中持有的文件。

@@ -40,14 +40,14 @@ def handlers(self) -> List[CallbackHandler]:

def add_handler(self, handler: CallbackHandler):
if handler in self._handlers:
raise RuntimeError(f"The callback handler {handler} is already registered.")
raise ValueError(f"The callback handler {handler} is already registered.")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

按照Python文档中的定义调整异常类型,后同。

erniebot-agent/src/erniebot_agent/file_io/file_registry.py Outdated Show resolved Hide resolved
@@ -0,0 +1,119 @@
import contextlib
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

后续可用于模拟文件服务器

class TestAgentResponseAnnotations(unittest.TestCase):
def setUp(self):
class TestAgentResponseAnnotations(unittest.IsolatedAsyncioTestCase):
async def asyncSetUp(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

资源有申请就有释放。后同。

@Bobholamovic Bobholamovic marked this pull request as ready for review December 21, 2023 13:17
@codecov-commenter
Copy link

codecov-commenter commented Dec 21, 2023

Codecov Report

Attention: 148 lines in your changes are missing coverage. Please review.

Comparison is base (854b8e2) 65.07% compared to head (04d6ee1) 64.77%.

Files Patch % Lines
...erniebot_agent/file/global_file_manager_handler.py 34.78% 30 Missing ⚠️
...ebot-agent/src/erniebot_agent/file/file_manager.py 73.33% 28 Missing ⚠️
...iebot-agent/src/erniebot_agent/file/remote_file.py 35.29% 22 Missing ⚠️
erniebot-agent/src/erniebot_agent/utils/mixins.py 55.81% 19 Missing ⚠️
erniebot-agent/src/erniebot_agent/agents/base.py 50.00% 13 Missing ⚠️
...t-agent/src/erniebot_agent/tools/remote_toolkit.py 60.00% 6 Missing ⚠️
erniebot-agent/src/erniebot_agent/file/protocol.py 61.53% 5 Missing ⚠️
erniebot-agent/src/erniebot_agent/utils/logging.py 28.57% 5 Missing ⚠️
...bot-agent/src/erniebot_agent/file/file_registry.py 66.66% 4 Missing ⚠️
erniebot-agent/src/erniebot_agent/file/base.py 57.14% 3 Missing ⚠️
... and 7 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #193      +/-   ##
===========================================
- Coverage    65.07%   64.77%   -0.30%     
===========================================
  Files           56       57       +1     
  Lines         2789     2947     +158     
===========================================
+ Hits          1815     1909      +94     
- Misses         974     1038      +64     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Bobholamovic Bobholamovic marked this pull request as draft December 21, 2023 18:54
access_token: Optional[str], save_dir: Optional[str], **opts: Any
) -> FileManager:
if access_token is None:
access_token = C.get_global_access_token()
Copy link
Contributor

@Southpika Southpika Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是否需要将eb模型和file在环境变量中的ak进行强绑定呢~用户如果在环境变量中设置了ak的话就无法使用local file了。

不知道有没有一种情况用户只想将ak用于eb的模型调起 但是他这个ak没有开通remote file的服务。

Copy link
Member Author

@Bobholamovic Bobholamovic Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已增加enable_remote_file选项控制是否启用remote file管理功能。

@Bobholamovic Bobholamovic marked this pull request as ready for review December 22, 2023 08:38
else:
continue
agent_files.append(AgentFile(file=file, type=file_type, used_by=tool.tool_name))
raise RuntimeError(f"Unregistered file with ID {repr(val)} is used by {repr(tool)}.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

感觉是不是可以把file type拼进去,用户可以知道是输入的文件出问题还是输出的?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return _global_file_manager


async def configure_global_file_manager(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里需要一个参数来设置使用是否使用remote_file,就是上午说的需要一个开关。用户设置了access token,是希望使用aistudio的tool,而不希望使用remote file。

Copy link
Member Author

@Bobholamovic Bobholamovic Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已增加enable_remote_file选项控制是否启用remote file管理功能。

save_dir: Optional[FilePath] = None,
*,
default_file_type: Optional[Literal["local", "remote"]] = None,
prune_on_close: bool = True,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prune_on_close参数主要什么作用?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个参数可以控制是否在对象close时自动清理那些可以安全删除的文件,如果设置为False的话文件会被保留

return _global_file_manager


async def configure_global_file_manager(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

改成set_global_file_manager?

Copy link
Member Author

@Bobholamovic Bobholamovic Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这个后来想了下,原因如下:

image

不过刚才的PR有个asyncio的data race问题,需要引入锁,而asyncio.Lock作为模块全局变量可能导致一些问题(例如用户使用自定义的event loop时,锁可能不起作用),为此我改写了一个lazy Singleton,该类需要在异步函数中构造。

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

最新的代码里提供了set方法,用于直接设置FileManager对象。

@Southpika
Copy link
Contributor

更新之后factory的get_file_manager好像没有地方使用了,这个文件可以删除吗

@Bobholamovic
Copy link
Member Author

更新之后factory的get_file_manager好像没有地方使用了,这个文件可以删除吗

已删除

f"A file is used by {repr(tool)}, but the agent has no file manager to fetch it."
raise RuntimeError(
f"Unregistered file with ID {repr(val)} is used by {repr(tool)}."
f" File type: {file_type}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

突然发现在look_up_file_by_id中 如果file为None已经抛出FileError了

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已修改

file = self._file_manager.look_up_file_by_id(val)
if protocol.is_file_id(val):
file_manager = await self._get_file_manager()
file = file_manager.look_up_file_by_id(val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看是否要在这个地方try catch还是让他自然报错

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

已修改为try-catch

@sijunhe sijunhe merged commit 8937548 into PaddlePaddle:develop Dec 25, 2023
6 of 7 checks passed
@Bobholamovic Bobholamovic deleted the agent/feat/add_environs branch December 25, 2023 09:06
skyan pushed a commit to skyan/ERNIE-Bot-SDK that referenced this pull request Dec 26, 2023
…addlePaddle#193)

* Fix makefiles

* Fix bugs

* Enhance file_io

* Update library code

* Update examples

* Update tests

* Fix exceptions

* Fix error info

* Remove use of environment variables in integration tests

* Fix protocol

* Fix type hints

* Fix

* Fix typing

* Fix style

* Fix cleanup bugs

* Fix data race

* Fix bugs

* Show file type

* Fix bugs

* Fix linting issues

* Fix linting issues

* Fix integration tests

* Remove unused file

* Fix and enhance

* Fix style

* Fix CI

* Update CI config

* Fix style

* Remove anchors

* Enhance mixins
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants